Abstract: An information collection problem in a wireless network with random events is considered. Wireless devices report on each event using one of multiple reporting formats. Each format has a different quality and uses different data lengths. Delivering all data in the highest quality format can overload system resources. The goal is to make intelligent format selection and routing decisions to maximize time-averaged information quality subject to network stability. Lyapunov optimization theory can be used to solve such a problem by repeatedly minimizing the linear terms of a quadratic drift-plus-penalty expression. To reduce delays, this paper proposes a novel extension of this technique that preserves the quadratic nature of the drift minimization while maintaining a fully separable structure. In addition, to avoid high queuing delay, paths are restricted to at most two hops. The resulting algorithm can push average information quality arbitrarily close to optimum, with a trade-off in queue backlog. The algorithm compares favorably to the basic drift-plus-penalty scheme in terms of backlog and delay. Furthermore, the technique is generalized to solve linear programs and yields smoother results than the standard drift-plus-penalty scheme.



I. Introduction 

This paper investigates dynamic scheduling and data format 
selection in a network where multiple wireless devices, such 
as smart phones, report information to a receiver station. 
The devices together act as a pervasive pool of information 
about the network environment. Such scenarios have been 
recently considered, for example, in applications of social 
sensing and personal environment monitoring 0, 0. 
Sending all information in the highest quality format can 
quickly overload network resources. Thus, it is often more 
important to optimize the quality of information, as defined 
by an end-user, rather than the raw number of bits that are 
sent. The case for quality-aware networking is made in 0, 
0, Q. Network management with quality of information 
awareness for wireless sensor networks is considered in 0. 
More recently, quality metrics of accuracy and credibility are 
considered in 0, [ 1 1 using simplified models that do not 
consider the actual dynamics of a wireless network. 

In this paper, we extend the quality-aware format selection
problem in [10] to a dynamic network setting. We particularly
focus on distributed algorithms for routing, scheduling, and
format selection that jointly optimize quality of information.
Specifically, we assume that random events occur over time 
in the network environment, and these can be sensed by 
one or more of the wireless devices, perhaps at different 
sensing qualities. At the transport layer, each device selects 
one of multiple reporting formats, such as a video clip at 
one of several resolution options, an audio clip, or a text 
message. Information quality depends on the selected format. 
For example, higher quality formats use messages with larger 
bit lengths. The resulting bits are handed to the network layer 
at each device and must be delivered to the receiver station 
over possibly time-varying channels. This delivery can be a 
direct transmission from a device to the receiver station via 
an uplink channel, or can take a two-hop path that utilizes
another device as a relay (we restrict paths to at most two hops
for tight control over network delays). An example is a
single-cell wireless network with multiple smart phones and one base
station, where each smart phone has 3G capability for uplink 
transmission and Wi-Fi capability for device-to-device relay 
transmission. 

Such a problem can be cast as a stochastic network
optimization and solved using Lyapunov optimization theory. A
"standard" method is to minimize a linear term in a quadratic
drift-plus-penalty expression [11], [12]. This can be shown to
yield algorithms that converge to optimal average utility with a 
trade-off in average queue size. The linearization is useful for 
enabling decisions to be separated at each device. However, 
it can lead to larger queue sizes and delays. In this work, we 
propose a novel method that uses a quadratic minimization for 
the drift-plus-penalty expression, yet still allows separability 
of the decisions. This results in an algorithm that maintains 
distributed decisions across all devices for format selection and 
routing, similar to the standard (linearized) drift-plus-penalty 
approach, but reduces overall queue size. 

For the derived algorithm, each device observes its input 
queue length and then selects a format to report an event 
according to a simple rule. The routing decision for each group 
of bits is determined at each device by considering its input, 
uplink, and relay queues. Then, allocation of channel resources 
for direct transmission is determined from a receiver station 
after observing current uplink queues and channel conditions. 
For the relay transmission, an optimization problem involving 
relay queues, uplink queues and channel conditions is solved 
at the receiver station to determine an optimal transmission
decision. This process can be decentralized if all channels are 
orthogonal. 

Our analysis shows that the standard drift-plus-penalty 
algorithm and our new algorithm both converge to the optimal 
quality of information. The analysis also shows a deterministic 
maximum size of each queue. Simulations show that the new
algorithm yields significant savings in queue length, which
implies a reduction in average delay.

Because the novel method is general, it is applied
to solve linear programs in the last section. Linear programs
are a special case of the stochastic problems treated in [12],
and hence can be solved by the (linearized) drift-plus-penalty
method of Lyapunov optimization theory. This is done in [13]
to distributively solve linear programs over graphs. The current
paper applies our novel quadratic drift-plus-penalty algorithm
to linear programs to produce smoother results and faster
convergence. Although a solution of this new technique is the
time-average of results from multiple iterations, it is different
from the "dual averaging" method of [14], which has a different
problem construction, and from the "alternating direction
method of multipliers" in [15], which arises from gradient
descent methods rather than from Lyapunov optimization.

Thus, our contributions are threefold: (i) We formulate
an important quality-of-information problem for reporting
information in wireless systems. This problem is of recent
interest and can be used in other contexts where "data deluge"
issues require selectivity in reporting of information. (ii) We
extend Lyapunov optimization theory by presenting a new
algorithm that uses a quadratic minimization to reduce queue
sizes while maintaining separability across decisions. This new
technique is general and can be used to reduce queue sizes
in other Lyapunov optimization problems. (iii) We illustrate
the potential of the quadratic minimization for solving linear
programs.

In the next section we formulate the problem. Sec. III
derives the novel quadratic algorithm. Sec. IV analyzes its
performance. Sec. V presents simulation results. Sec. VI
illustrates how to solve linear programs. The conclusion is
in Sec. VII.

II. System Model 

Consider a network with $N$ wireless devices that report
information to a single receiver station. Let $\mathcal{N} = \{1, \ldots, N\}$
be the set of devices. The receiver station is not part of the set
$\mathcal{N}$ and can be viewed as "device 0." A network with $N = 3$
devices is shown in Fig. 1. The system is slotted with fixed size
slots $t \in \{0, 1, 2, \ldots\}$. Every slot, format selection decisions
are made at the transport layer of each device, and routing
and scheduling decisions are made at the network layer.

A. Format Selection 

A new event can occur on each slot. Events are observed
with different levels of quality at each device. For example,
some devices may be physically closer to the event and
hence can deliver higher quality. On slot $t$, each device
$n \in \mathcal{N}$ selects a format $f_n(t)$ from a set of available formats
$\mathcal{F} = \{0, 1, \ldots, F\}$. Format selection affects quality and data
lengths of the reported information. To model this, the event
on slot $t$ is described by a vector of event characteristics
$(r_n^{(f)}(t), d_n^{(f)}(t))|_{n \in \mathcal{N}, f \in \mathcal{F}}$. The value $r_n^{(f)}(t)$ is a numeric
reward that is earned if device $n$ uses format $f$ to report
on the event that occurs on slot $t$. The value $d_n^{(f)}(t)$ is
the amount of data units required for this choice. This data
is injected into the network layer and must eventually be
delivered to the receiver station. To allow a device $n$ not to
report on an event, there is a "blank format" $0 \in \mathcal{F}$ such
that $(r_n^{(0)}(t), d_n^{(0)}(t)) = (0, 0)$ for all slots $t$ and all devices
$n \in \mathcal{N}$. If a device $n$ does not observe the event on slot $t$
(which might occur if it is physically too far from the event),
then $(r_n^{(f)}(t), d_n^{(f)}(t)) = (0, 0)$ for all formats $f \in \mathcal{F}$. If no
event occurs on slot $t$, then $(r_n^{(f)}(t), d_n^{(f)}(t)) = (0, 0)$ for all
$n \in \mathcal{N}$ and $f \in \mathcal{F}$.

Fig. 1. An example network with illustration of the internal queues $K_n(t)$,
$Q_n(t)$, $J_n(t)$ for each device $n$.

Rewards $r_n^{(f)}(t)$ are assumed to be real numbers that satisfy
$0 \le r_n^{(f)}(t) \le r_n^{(\max)}$ for all $t$, where $r_n^{(\max)}$ is a finite maximum.
Data sizes $d_n^{(f)}(t)$ are non-negative integers that satisfy
$0 \le d_n^{(f)}(t) \le d_n^{(\max)}$ for all $t$, where $d_n^{(\max)}$ is a finite maximum.
The vectors $(r_n^{(f)}(t), d_n^{(f)}(t))|_{n \in \mathcal{N}, f \in \mathcal{F}}$ are independent and
identically distributed (i.i.d.) over slots $t$, and have a joint
probability distribution over devices $n$ and formats $f$ that is
arbitrary (subject to the above properties). This distribution is
not necessarily known.

B. Routing and Scheduling 

At each device $n \in \mathcal{N}$, the $d_n(t)$ units of data generated
by format selection are put into input queue $K_n(t)$. Each
device has two orthogonal communication capabilities, called
(direct) uplink transmission and (ad-hoc) relay transmission.
The uplink transmission capability allows each device to
communicate to the receiver station directly via an uplink
channel. The relay capability allows communication between a
device and its neighboring devices. To ensure all data takes at
most two hops to the destination, the data in each queue $K_n(t)$
is internally routed to one of two queues $Q_n(t)$ and $J_n(t)$,
respectively holding data for uplink and relay transmission (see
Fig. 1). Data in queue $Q_n(t)$ must be transmitted directly to the
receiver station, while data in queue $J_n(t)$ can be transmitted
to another device $k$, but is then placed in queue $Q_k(t)$ for that
device. This is conceptually similar to the hop-count based
queue architecture in [16].

In each slot $t$, let $s_n^{(q)}(t)$ and $s_n^{(j)}(t)$ represent the amount of
data in $K_n(t)$ that can be internally moved to $Q_n(t)$ and $J_n(t)$,
respectively, as illustrated in Fig. 1. These decision variables
are chosen within sets $\mathcal{S}_n^{(q)}$ and $\mathcal{S}_n^{(j)}$, respectively, where:

$$\mathcal{S}_n^{(q)} \triangleq \{0, 1, \ldots, s_n^{(q)(\max)}\}$$
$$\mathcal{S}_n^{(j)} \triangleq \{0, 1, \ldots, s_n^{(j)(\max)}\}$$

where $s_n^{(q)(\max)}$, $s_n^{(j)(\max)}$ are finite maximum values. Then the
dynamics of $K_n(t)$ are:

$$K_n(t+1) = \max[K_n(t) - s_n^{(q)}(t) - s_n^{(j)}(t), 0] + d_n(t) \tag{1}$$

As a minor technical detail that is useful later, the $\max[\cdots, 0]$
operation above allows the $s_n^{(q)}(t)$ and $s_n^{(j)}(t)$ decisions to sum
to more than $K_n(t)$. The actual $s_n^{(q)(\mathrm{act})}(t)$ and $s_n^{(j)(\mathrm{act})}(t)$ data
units moved from $K_n(t)$ can be any values that satisfy:

$$s_n^{(q)(\mathrm{act})}(t) + s_n^{(j)(\mathrm{act})}(t) = \min[K_n(t), s_n^{(q)}(t) + s_n^{(j)}(t)] \tag{2}$$
$$0 \le s_n^{(q)(\mathrm{act})}(t) \le s_n^{(q)}(t) \tag{3}$$
$$0 \le s_n^{(j)(\mathrm{act})}(t) \le s_n^{(j)}(t) \tag{4}$$

Wireless transmission is assumed to be channel-aware, and
decision options are determined by a vector $\eta(t)$ of current
channel states in the network. Specifically, let $u_n(t)$ be the
amount of uplink data that can be transmitted from device $n$
to the receiver station, and let $u(t) = (u_n(t))|_{n \in \mathcal{N}}$ be the
vector of these transmission decisions. It is assumed that $u(t)$
is chosen every slot $t$ within a set $\mathcal{U}_{\eta(t)}$ that depends on the
observed $\eta(t)$. Similarly, let $a_{nm}(t)$ be the amount of data
selected for ad-hoc transmission between devices $n$ and $m$,
and let $a(t) = (a_{nm}(t))|_{n,m \in \mathcal{N}}$, with $a_{nn}(t) = 0$ for every $t$
and $n$. These transmissions are assumed to be orthogonal to
the uplink transmissions. Every slot $t$, the $a(t)$ vector is chosen
within a set $\mathcal{A}_{\eta(t)}$ that depends on the observed $\eta(t)$. The sets
$\mathcal{U}_{\eta(t)}$ and $\mathcal{A}_{\eta(t)}$ depend on the resource allocation, modulation,
and coding options for transmission. If each uplink channel
is orthogonal, then the set $\mathcal{U}_{\eta(t)}$ can be decomposed into a set
product of individual options for each uplink, where each
option depends on the component of $\eta(t)$ that represents its
own uplink channel. Orthogonal relay links can be treated
similarly.

The dynamics of relay queue $J_n(t)$ are:

$$J_n(t+1) = \max\Big[J_n(t) - \sum_{m \in \mathcal{N}} a_{nm}(t) + s_n^{(j)(\mathrm{act})}(t), 0\Big] \tag{5}$$

As before, the actual amount of data $a_{nm}^{(\mathrm{act})}(t)$ satisfies:

$$\sum_{m \in \mathcal{N}} a_{nm}^{(\mathrm{act})}(t) = \min\Big[J_n(t) + s_n^{(j)(\mathrm{act})}(t), \sum_{m \in \mathcal{N}} a_{nm}(t)\Big] \tag{6}$$
$$0 \le a_{nm}^{(\mathrm{act})}(t) \le a_{nm}(t) \text{ for } m \in \mathcal{N}. \tag{7}$$

The dynamics of uplink queue $Q_n(t)$ are:

$$Q_n(t+1) = \max[Q_n(t) - u_n(t) + s_n^{(q)(\mathrm{act})}(t), 0] + \sum_{m \in \mathcal{N}} a_{mn}^{(\mathrm{act})}(t). \tag{8}$$

Notice that all data transmitted to a relay is placed in the
uplink queue of that relay (which ensures all paths take at
most two hops). The queueing equations (5) and (8) involve
actual amounts of data, but they can be bounded using (3), (4),
and (7) as

$$J_n(t+1) \le \max\Big[J_n(t) - \sum_{m \in \mathcal{N}} a_{nm}(t) + s_n^{(j)}(t), 0\Big] \tag{9}$$
$$Q_n(t+1) \le \max[Q_n(t) - u_n(t) + s_n^{(q)}(t), 0] + \sum_{m \in \mathcal{N}} a_{mn}(t). \tag{10}$$

The queue dynamics (1), (9), (10) do not require the actual
variables $s_n^{(q)(\mathrm{act})}(t)$, $s_n^{(j)(\mathrm{act})}(t)$, $a_{nm}^{(\mathrm{act})}(t)$, and are the only ones
needed in the rest of the paper.

Assume the decision sets $\mathcal{U}_{\eta(t)}$ and $\mathcal{A}_{\eta(t)}$ ensure that
transmissions have bounded rates. Specifically, let $u_n^{(\max)}$ and
$a_{nm}^{(\max)}$ be finite maximum values of $u_n(t)$ and $a_{nm}(t)$. Further,
assume that for each $n \in \mathcal{N}$, $s_n^{(q)(\max)} \ge u_n^{(\max)}$ and
$s_n^{(j)(\max)} \ge \sum_{m \in \mathcal{N}} a_{nm}^{(\max)}$, so that the maximum amount that
can be internally shifted is at least as much as the maximum
amount that can be transmitted.
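The queue updates above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `step_queues` is hypothetical, and because the model allows any actual amounts satisfying the constraints on $s_n^{(q)(\mathrm{act})}$, $s_n^{(j)(\mathrm{act})}$ and $a_{nm}^{(\mathrm{act})}$, the tie-breaking used here (serve the uplink move first, fill relay links in index order) is one arbitrary feasible choice, not a rule from the paper.

```python
def step_queues(K, Q, J, d, s_q, s_j, u, a):
    """Advance input (K), uplink (Q) and relay (J) queues by one slot.

    K, Q, J, d, s_q, s_j, u are lists indexed by device; a[n][m] is the
    relay rate offered from device n to device m (a[n][n] is 0).
    """
    N = len(K)
    s_q_act, s_j_act = [0] * N, [0] * N
    a_act = [[0] * N for _ in range(N)]
    for n in range(N):
        # Actual internal moves cannot exceed the input backlog K_n.
        total = min(K[n], s_q[n] + s_j[n])
        s_q_act[n] = min(s_q[n], total)   # assumed tie-break: uplink first
        s_j_act[n] = total - s_q_act[n]
    for n in range(N):
        # Actual relayed data cannot exceed what the relay queue holds.
        budget = min(J[n] + s_j_act[n], sum(a[n]))
        for m in range(N):                # assumed tie-break: index order
            a_act[n][m] = min(a[n][m], budget)
            budget -= a_act[n][m]
    K_next = [max(K[n] - s_q[n] - s_j[n], 0) + d[n] for n in range(N)]
    J_next = [max(J[n] - sum(a[n]) + s_j_act[n], 0) for n in range(N)]
    Q_next = [max(Q[n] - u[n] + s_q_act[n], 0)
              + sum(a_act[m][n] for m in range(N)) for n in range(N)]
    return K_next, Q_next, J_next


# Two devices with made-up backlogs and decisions.
print(step_queues(K=[5, 0], Q=[1, 2], J=[0, 3], d=[2, 0],
                  s_q=[3, 1], s_j=[4, 0], u=[2, 1], a=[[0, 1], [2, 0]]))
```

In this example device 0 requests to move 7 units but holds only 5, so the actual moves are capped, which is exactly the situation the $\max[\cdots, 0]$ operation is meant to accommodate.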

C. Stochastic Network Optimization 

Here we define the problem of maximizing time-averaged
quality of information subject to queue stability. We use the
following definitions [12]:

Definition 1: Queue $\{X(t) : t \in \{0, 1, 2, \ldots\}\}$ is strongly
stable if

$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{X(\tau)\} < \infty$$

Definition 2: A network of queues is strongly stable if every
queue in the network is strongly stable.

In words, Definition 1 means that a queue is strongly stable
if its average queue backlog is finite.

Let $y_0(t) = \sum_{n \in \mathcal{N}} r_n(t)$ be the total quality of information
from format selection on slot $t$, and let $y_0^{(\max)} = \sum_{n \in \mathcal{N}} r_n^{(\max)}$ be
its upper bound. The time-averaged total information quality
is

$$\bar{y}_0 = \liminf_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{y_0(\tau)\}.$$

For simplicity of notation, let $\omega(t)$ represent a collective
vector of event and channel randomness on slot $t$, and let
$\alpha(t)$ be a collective vector of all decision variables on slot $t$:

$$\omega(t) \triangleq [\eta(t); (r_n^{(f)}(t), d_n^{(f)}(t))|_{n \in \mathcal{N}, f \in \mathcal{F}}]$$
$$\alpha(t) \triangleq [a(t); u(t); (f_n(t))|_{n \in \mathcal{N}}; (s_n^{(q)}(t), s_n^{(j)}(t))|_{n \in \mathcal{N}}]$$

It is our objective to solve:

$$\text{Maximize } \bar{y}_0 \tag{11}$$
$$\text{Subject to Network is strongly stable}$$
$$\alpha(t) \in \mathcal{A}_{\omega(t)} \text{ for all } t,$$

where $\mathcal{A}_{\omega(t)}$ is a feasible set of control actions depending on
randomness at time $t$. So, any selected $\alpha(t) \in \mathcal{A}_{\omega(t)}$ yields:

$$f_n(t) \in \mathcal{F} \text{ for all } n \in \mathcal{N}$$
$$s_n^{(q)}(t) \in \mathcal{S}_n^{(q)} \text{ for all } n \in \mathcal{N}$$
$$s_n^{(j)}(t) \in \mathcal{S}_n^{(j)} \text{ for all } n \in \mathcal{N}$$
$$u(t) \in \mathcal{U}_{\eta(t)}$$
$$a(t) \in \mathcal{A}_{\eta(t)}$$

This problem is always feasible because stability is trivially
achieved if all devices always select the blank format.
III. Dynamic Algorithm

This section derives a novel "quadratic policy" to solve
problem (11). The policy gives faster convergence and smaller
queue sizes as compared to the "standard" drift-plus-penalty
(or "max-weight") policy of [11], [12].

A. Lyapunov Optimization

Let $\Theta(t) = (K_n(t), Q_n(t), J_n(t))|_{n \in \mathcal{N}}$ represent a vector
of all queues in the system.

Define a quadratic Lyapunov function
$L(\Theta(t)) = \frac{1}{2} \sum_{n \in \mathcal{N}} [K_n(t)^2 + Q_n(t)^2 + J_n(t)^2]$. Then the Lyapunov drift,
the difference of Lyapunov functions between two consecutive
slots, is defined by $L(\Theta(t+1)) - L(\Theta(t))$.

In order to maximize $\bar{y}_0$ in (11), the drift-plus-penalty
function $L(\Theta(t+1)) - L(\Theta(t)) - V y_0(t)$ is considered, where
$V > 0$ is a constant that determines a trade-off between queue
size and proximity to optimality.$^1$

Later, this drift is used to show stability of the system.
Intuitively, when queue lengths grow beyond certain
values, the drift becomes negative and the system is stable
because the negative drift roughly implies a reduction of total
queue lengths.

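Let $\mathbb{R}$ and $\mathbb{R}_+$ denote the set of real numbers and
non-negative real numbers, respectively.

Lemma 1: Let $a_i \in \mathbb{R}$ and $b_j \in \mathbb{R}_+$ for $i \in \{1, 2, \ldots, A\}$
and $j \in \{1, 2, \ldots, B\}$. Assume further that $|a_i| \le a_i^{(\max)}$
and $|b_j| \le b_j^{(\max)}$ for each feasible $i$ and $j$. Then for any
$x \in \mathbb{R}_+$:

$$\Big[\max\Big(x + \sum_{i=1}^{A} a_i, 0\Big) + \sum_{j=1}^{B} b_j\Big]^2 - x^2 \le \sum_{i=1}^{A} (x + a_i)^2 + \sum_{j=1}^{B} (x + b_j)^2 - (A + B)x^2 + C \tag{12}$$
$$\le 2x\Big[\sum_{i=1}^{A} a_i + \sum_{j=1}^{B} b_j\Big] + C' \tag{13}$$

where

$$C = 2\sum_{i=1}^{A}\sum_{i'=1}^{i-1} a_i^{(\max)} a_{i'}^{(\max)} + 2\sum_{j=1}^{B}\sum_{j'=1}^{j-1} b_j^{(\max)} b_{j'}^{(\max)} + 2\sum_{i=1}^{A}\sum_{j=1}^{B} a_i^{(\max)} b_j^{(\max)}$$
$$C' = C + \sum_{i=1}^{A} \big(a_i^{(\max)}\big)^2 + \sum_{j=1}^{B} \big(b_j^{(\max)}\big)^2$$

Note that the first bound (12) is used in the quadratic policy,
while the second bound (13) can lead to the max-weight policy.

$^1$ The minus sign in front of $V y_0(t)$ is because the quality of information
can be viewed as a negative penalty.

Proof:

$$\Big[\max\Big(x + \sum_{i=1}^{A} a_i, 0\Big) + \sum_{j=1}^{B} b_j\Big]^2 - x^2$$
$$\le \Big(x + \sum_{i=1}^{A} a_i\Big)^2 + \Big(\sum_{j=1}^{B} b_j\Big)^2 + 2\sum_{j=1}^{B} b_j \Big|x + \sum_{i=1}^{A} a_i\Big| - x^2$$
$$\le 2x\sum_{i=1}^{A} a_i + \sum_{i=1}^{A} a_i^2 + 2\sum_{i=1}^{A}\sum_{i'=1}^{i-1} |a_i a_{i'}| + \sum_{j=1}^{B} b_j^2 + 2\sum_{j=1}^{B}\sum_{j'=1}^{j-1} b_j b_{j'} + 2\sum_{j=1}^{B} b_j \Big[x + \sum_{i=1}^{A} |a_i|\Big]$$
$$= 2x\sum_{i=1}^{A} a_i + \sum_{i=1}^{A} a_i^2 + 2\sum_{i=1}^{A}\sum_{i'=1}^{i-1} |a_i a_{i'}| + \sum_{j=1}^{B} b_j^2 + 2\sum_{j=1}^{B}\sum_{j'=1}^{j-1} b_j b_{j'} + 2x\sum_{j=1}^{B} b_j + 2\sum_{i=1}^{A}\sum_{j=1}^{B} |a_i| b_j$$
$$= \sum_{i=1}^{A} (x + a_i)^2 + \sum_{j=1}^{B} (x + b_j)^2 - (A + B)x^2 + 2\sum_{i=1}^{A}\sum_{i'=1}^{i-1} |a_i a_{i'}| + 2\sum_{j=1}^{B}\sum_{j'=1}^{j-1} b_j b_{j'} + 2\sum_{i=1}^{A}\sum_{j=1}^{B} |a_i| b_j$$
$$\le \sum_{i=1}^{A} (x + a_i)^2 + \sum_{j=1}^{B} (x + b_j)^2 - (A + B)x^2 + C \tag{14}$$
$$\le 2x\Big[\sum_{i=1}^{A} a_i + \sum_{j=1}^{B} b_j\Big] + \sum_{i=1}^{A} \big(a_i^{(\max)}\big)^2 + \sum_{j=1}^{B} \big(b_j^{(\max)}\big)^2 + C$$
$$= 2x\Big[\sum_{i=1}^{A} a_i + \sum_{j=1}^{B} b_j\Big] + C' \tag{15}$$

Inequalities (14) and (15) prove respectively relations (12) and
(13). ■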
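The two bounds of Lemma 1 can be sanity-checked numerically. The sketch below is an illustration, not part of the paper: it assumes the bounded quantity is $[\max(x + \sum_i a_i, 0) + \sum_j b_j]^2 - x^2$ with $x \ge 0$, $|a_i| \le a_i^{(\max)}$ and $0 \le b_j \le b_j^{(\max)}$, and that the constants are built from pairwise products of the maxima as in the proof. The names `lemma1_sides` and `check_lemma1` are hypothetical.

```python
import random


def lemma1_sides(x, a, b, a_max, b_max):
    """Return (lhs, quadratic_bound, linear_bound) for the two bounds."""
    A, B = len(a), len(b)
    lhs = (max(x + sum(a), 0.0) + sum(b)) ** 2 - x ** 2
    # Constant C: pairwise products of the maxima (a-a, b-b and a-b pairs).
    C = (2 * sum(a_max[i] * a_max[k] for i in range(A) for k in range(i))
         + 2 * sum(b_max[j] * b_max[k] for j in range(B) for k in range(j))
         + 2 * sum(a_max) * sum(b_max))
    quad = (sum((x + ai) ** 2 for ai in a) + sum((x + bj) ** 2 for bj in b)
            - (A + B) * x ** 2 + C)
    C_prime = C + sum(v ** 2 for v in a_max) + sum(v ** 2 for v in b_max)
    lin = 2 * x * (sum(a) + sum(b)) + C_prime
    return lhs, quad, lin


def check_lemma1(trials=1000, seed=0):
    """Test lhs <= quadratic bound <= linear bound on random instances."""
    rng = random.Random(seed)
    for _ in range(trials):
        a_max = [rng.uniform(0, 5) for _ in range(rng.randint(1, 4))]
        b_max = [rng.uniform(0, 5) for _ in range(rng.randint(0, 4))]
        a = [rng.uniform(-m, m) for m in a_max]
        b = [rng.uniform(0, m) for m in b_max]
        x = rng.uniform(0, 50)
        lhs, quad, lin = lemma1_sides(x, a, b, a_max, b_max)
        if not (lhs <= quad + 1e-9 and quad <= lin + 1e-9):
            return False
    return True


print(check_lemma1())
```

The quadratic bound keeps the terms $\sum_i a_i^2$ and $\sum_j b_j^2$ that the linear bound replaces by their maxima, which is why it is never looser than the linear one.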

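Using queuing dynamics (1), (9), and (10), the drift-plus-penalty
is bounded by (16) below. Then, using relation (12),
the bound becomes (17).

$$L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau)$$
$$\le \frac{1}{2}\sum_{n \in \mathcal{N}} \Big\{ \big[\max(K_n(\tau) - s_n^{(q)}(\tau) - s_n^{(j)}(\tau), 0) + d_n(\tau)\big]^2 - K_n(\tau)^2$$
$$\quad + \Big[\max(Q_n(\tau) - u_n(\tau) + s_n^{(q)}(\tau), 0) + \sum_{m \in \mathcal{N}} a_{mn}(\tau)\Big]^2 - Q_n(\tau)^2$$
$$\quad + \Big[\max\Big(J_n(\tau) - \sum_{m \in \mathcal{N}} a_{nm}(\tau) + s_n^{(j)}(\tau), 0\Big)\Big]^2 - J_n(\tau)^2 - 2V r_n(\tau) \Big\} \tag{16}$$
$$\le \frac{1}{2}\sum_{n \in \mathcal{N}} \Big\{ [K_n(\tau) - s_n^{(q)}(\tau)]^2 + [K_n(\tau) - s_n^{(j)}(\tau)]^2 + [K_n(\tau) + d_n(\tau)]^2$$
$$\quad + [Q_n(\tau) - u_n(\tau)]^2 + [Q_n(\tau) + s_n^{(q)}(\tau)]^2 + \sum_{m \in \mathcal{N}} [Q_n(\tau) + a_{mn}(\tau)]^2$$
$$\quad + \sum_{m \in \mathcal{N}} [J_n(\tau) - a_{nm}(\tau)]^2 + [J_n(\tau) + s_n^{(j)}(\tau)]^2 - 2V r_n(\tau) + D_n(\tau) \Big\} \tag{17}$$

where

$$D_n(\tau) \triangleq -3K_n(\tau)^2 - (2 + |\mathcal{N}|) Q_n(\tau)^2 - (1 + |\mathcal{N}|) J_n(\tau)^2$$
$$\quad + 2 s_n^{(q)(\max)} s_n^{(j)(\max)} + 2 s_n^{(q)(\max)} d_n^{(\max)} + 2 s_n^{(j)(\max)} d_n^{(\max)}$$
$$\quad + 2 u_n^{(\max)} s_n^{(q)(\max)} + 2 u_n^{(\max)} \sum_{m \in \mathcal{N}} a_{mn}^{(\max)} + 2 s_n^{(q)(\max)} \sum_{m \in \mathcal{N}} a_{mn}^{(\max)}$$
$$\quad + \sum_{m \in \mathcal{N}} \sum_{m' \in \mathcal{N} - \{m\}} a_{mn}^{(\max)} a_{m'n}^{(\max)} + 2 s_n^{(j)(\max)} \sum_{m \in \mathcal{N}} a_{nm}^{(\max)} + \sum_{m \in \mathcal{N}} \sum_{m' \in \mathcal{N} - \{m\}} a_{nm}^{(\max)} a_{nm'}^{(\max)}$$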



Minimizing the actual drift-plus-penalty term (16) is
computationally expensive. In this paper, we propose a novel
quadratic policy, derived from (17), that preserves the
quadratic nature of the actual minimization while keeping
decisions separable. As a result, the policy leads to a separated
control algorithm in Sec. III-B.

Definition 3: Every time $t$, the quadratic policy observes
current queue backlogs $\Theta(t)$ and randomness $\omega(t)$. Then it
makes a decision according to the following minimization
problem.

$$\text{Minimize } \sum_{n \in \mathcal{N}} \Big\{ [K_n(t) - s_n^{(q)}(t)]^2 + [K_n(t) - s_n^{(j)}(t)]^2 + [K_n(t) + d_n(t)]^2$$
$$\quad + [Q_n(t) - u_n(t)]^2 + [Q_n(t) + s_n^{(q)}(t)]^2 + \sum_{m \in \mathcal{N}} [Q_n(t) + a_{mn}(t)]^2$$
$$\quad + \sum_{m \in \mathcal{N}} [J_n(t) - a_{nm}(t)]^2 + [J_n(t) + s_n^{(j)}(t)]^2 - 2V r_n(t) \Big\}$$
$$\text{Subject to } f_n(t) \in \mathcal{F},\ d_n(t) = d_n^{(f_n(t))}(t),\ r_n(t) = r_n^{(f_n(t))}(t) \quad \forall n \in \mathcal{N}$$



B. Separability 

The control algorithm can be derived from the quadratic
policy in Definition 3. The whole minimization can be done
separately due to the unique structure of the quadratic policy.
This leads to five subproblems, as described below.

At every slot $t$, each device $n \in \mathcal{N}$ observes input queue
$K_n(t)$ and options $(r_n^{(f)}(t), d_n^{(f)}(t))|_{f \in \mathcal{F}}$. It then chooses a
format $f_n(t)$ according to the admission-control problem:

$$\text{Minimize } \big[K_n(t) + d_n^{(f_n(t))}(t)\big]^2 - 2V r_n^{(f_n(t))}(t) \tag{18}$$
$$\text{Subject to } f_n(t) \in \mathcal{F}$$

This is solved easily by comparing each option $f_n(t) \in \mathcal{F}$.

Each device $n$ moves data from its input queue to its uplink
queue according to the uplink routing problem

$$\text{Minimize } [K_n(t) - s_n^{(q)}(t)]^2 + [Q_n(t) + s_n^{(q)}(t)]^2 \tag{19}$$
$$\text{Subject to } s_n^{(q)}(t) \in \mathcal{S}_n^{(q)}.$$

This can be solved in closed form by letting
$I_Q^+(t) = \lceil \frac{K_n(t) - Q_n(t)}{2} \rceil$, $I_Q^-(t) = \lfloor \frac{K_n(t) - Q_n(t)}{2} \rfloor$ and
$g_Q(x, t) = [K_n(t) - x]^2 + [Q_n(t) + x]^2$. Then choose

$$s_n^{(q)}(t) = \begin{cases} s_n^{(q)(\max)}, & K_n(t) - Q_n(t) \ge 2 s_n^{(q)(\max)} \\ \arg\min_{x \in \{I_Q^+(t), I_Q^-(t)\}} g_Q(x, t), & 0 \le K_n(t) - Q_n(t) < 2 s_n^{(q)(\max)} \\ 0, & K_n(t) - Q_n(t) < 0 \end{cases} \tag{20}$$

Also each device $n$ moves data from its input queue to its
relay queue according to the relay routing problem

$$\text{Minimize } [K_n(t) - s_n^{(j)}(t)]^2 + [J_n(t) + s_n^{(j)}(t)]^2 \tag{21}$$
$$\text{Subject to } s_n^{(j)}(t) \in \mathcal{S}_n^{(j)}.$$

Again, let $I_J^+(t) = \lceil \frac{K_n(t) - J_n(t)}{2} \rceil$, $I_J^-(t) = \lfloor \frac{K_n(t) - J_n(t)}{2} \rfloor$ and
$g_J(x, t) = [K_n(t) - x]^2 + [J_n(t) + x]^2$. Then choose

$$s_n^{(j)}(t) = \begin{cases} s_n^{(j)(\max)}, & K_n(t) - J_n(t) \ge 2 s_n^{(j)(\max)} \\ \arg\min_{x \in \{I_J^+(t), I_J^-(t)\}} g_J(x, t), & 0 \le K_n(t) - J_n(t) < 2 s_n^{(j)(\max)} \\ 0, & K_n(t) - J_n(t) < 0 \end{cases} \tag{22}$$

Note that the solutions from the quadratic policy are
"smoother" as compared to the solutions from the max-weight
policy, which would choose "bang-bang" decisions of either $0$ or
$s_n^{(q)(\max)}$ for $s_n^{(q)}(t)$, and $0$ or $s_n^{(j)(\max)}$ for $s_n^{(j)}(t)$.

The uplink allocation problem is

$$\text{Minimize } \sum_{n \in \mathcal{N}} [Q_n(t) - u_n(t)]^2 \tag{23}$$
$$\text{Subject to } u(t) \in \mathcal{U}_{\eta(t)}.$$

This can be solved at the receiver station. If all uplink channels
are orthogonal, the problem can be decomposed further to be
solved at each device $n$ by

$$\text{Minimize } [Q_n(t) - u_n(t)]^2 \tag{24}$$
$$\text{Subject to } u_n(t) \in \mathcal{U}_{n,\eta(t)},$$

where $\mathcal{U}_{n,\eta(t)}$ is a feasible set of $u_n(t)$. An optimal uplink
transmission rate is the closest rate in $\mathcal{U}_{n,\eta(t)}$ to $Q_n(t)$.

The relay allocation problem is

$$\text{Minimize } \sum_{n \in \mathcal{N}} \sum_{m \in \mathcal{N}} \big\{ [Q_n(t) + a_{mn}(t)]^2 + [J_n(t) - a_{nm}(t)]^2 \big\} \tag{25}$$
$$\text{Subject to } a(t) \in \mathcal{A}_{\eta(t)}.$$

If channels are orthogonal so the sets have a product form, then
the decisions are separable across transmission links $(n, m)$ for
$n \in \mathcal{N}$, $m \in \mathcal{N}$ as

$$\text{Minimize } [Q_m(t) + a_{nm}(t)]^2 + [J_n(t) - a_{nm}(t)]^2 \tag{26}$$
$$\text{Subject to } a_{nm}(t) \in \mathcal{A}_{nm,\eta(t)},$$

where $\mathcal{A}_{nm,\eta(t)}$ is a feasible set of $a_{nm}(t)$. The closed form
solution of this problem is

$$a_{nm}(t) = \begin{cases} a_{nm}^{(\max)}, & J_n(t) - Q_m(t) \ge 2 a_{nm}^{(\max)} \\ \arg\min_{x \in \{I_A^+(t), I_A^-(t)\}} g_A(x, t), & 0 \le J_n(t) - Q_m(t) < 2 a_{nm}^{(\max)} \\ 0, & J_n(t) - Q_m(t) < 0 \end{cases} \tag{27}$$

where $g_A(x, t) = [J_n(t) - x]^2 + [Q_m(t) + x]^2$, and $I_A^+(t)$ and $I_A^-(t)$
are the feasible rates in $\mathcal{A}_{nm,\eta(t)}$ closest to $\frac{J_n(t) - Q_m(t)}{2}$ from above
and from below, respectively.

C. Algorithm

At every time slot $t$, our algorithm has two parts: device
side and receiver-station side.

Algorithm 1: Distributed format selection and routing

// Device side
foreach device $n \in \mathcal{N}$ do
    Observe $K_n(t)$, $Q_n(t)$ and $J_n(t)$
    Observe $(r_n^{(f)}(t), d_n^{(f)}(t))|_{f \in \mathcal{F}}$
    Select format according to (18)
    Move data from $K_n(t)$ to $Q_n(t)$ and $J_n(t)$ with values of
    $s_n^{(q)}(t)$, $s_n^{(j)}(t)$ calculated from (20) and (22)
end

Algorithm 2: Uplink and Relay resource allocation

// Receiver-station side
for receiver station do
    - Observe $(Q_n(t), J_n(t))|_{n \in \mathcal{N}}$
    - Observe $\mathcal{U}_{\eta(t)}$ and $\mathcal{A}_{\eta(t)}$
    - Signal devices $n \in \mathcal{N}$ to make uplink transmission
      $u(t)$ according to (23)
    - Signal devices $n \in \mathcal{N}$ to relay data $a(t)$ according
      to (25)
end

After these processes, queues $K_n(t+1)$, $Q_n(t+1)$ and
$J_n(t+1)$ are updated via (1), (5), (8), with actual amounts
$s_n^{(q)(\mathrm{act})}(t)$, $s_n^{(j)(\mathrm{act})}(t)$, $a_{nm}^{(\mathrm{act})}(t)$ satisfying (2)-(4) and (6)-(7).
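The device-side decisions can be sketched in a few lines. This is an illustrative sketch under assumptions, not the authors' implementation: the function names are hypothetical, queue lengths are taken to be non-negative integers, the admission rule is read as minimizing $[K_n(t) + d]^2 - 2Vr$ over the offered (reward, data) pairs, and the routing rule is read as minimizing $(K - s)^2 + (X + s)^2$ over $s \in \{0, \ldots, s^{(\max)}\}$, where $X$ stands for either the uplink or the relay backlog. Ties are broken toward the smaller index, which is an arbitrary choice.

```python
def select_format(K, options, V):
    """Pick the format index minimizing (K + d)^2 - 2*V*r.

    options[f] = (reward, data) for format f; index 0 is assumed to be
    the blank format (0, 0).
    """
    return min(range(len(options)),
               key=lambda f: (K + options[f][1]) ** 2 - 2 * V * options[f][0])


def route_closed_form(K, X, s_max):
    """Closed-form minimizer of (K - s)^2 + (X + s)^2 over {0, ..., s_max}."""
    diff = K - X
    if diff < 0:
        return 0
    if diff >= 2 * s_max:
        return s_max
    lo, hi = diff // 2, -(-diff // 2)     # floor and ceiling of diff / 2
    return min((lo, hi), key=lambda s: (K - s) ** 2 + (X + s) ** 2)


def route_brute_force(K, X, s_max):
    """Reference solution: compare every feasible amount."""
    return min(range(s_max + 1), key=lambda s: (K - s) ** 2 + (X + s) ** 2)


# With a short input queue the small report wins; with a long one the
# device stays silent (blank format).
formats = [(0, 0), (1, 2), (3, 10)]
print(select_format(K=10, options=formats, V=50))
print(select_format(K=100, options=formats, V=50))
print(route_closed_form(K=9, X=2, s_max=5))
```

The routing rule moves roughly half of the backlog difference, capped at the maximum, which is the "smoother" behaviour contrasted with a bang-bang rule that would move either nothing or the maximum.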



IV. Stability and Performance Bounds 

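Compare the quadratic policy with any other policy.
Let $(f_n(\tau), s_n^{(q)}(\tau), s_n^{(j)}(\tau))|_{n \in \mathcal{N}}, u(\tau), a(\tau)$ be the decision
variables from the quadratic policy in Definition 3. From
$f_n(\tau)$, $r_n(\tau) \triangleq r_n^{(f_n(\tau))}(\tau)$ and $d_n(\tau) \triangleq d_n^{(f_n(\tau))}(\tau)$. Then,
let $(\hat{f}_n(\tau), \hat{s}_n^{(q)}(\tau), \hat{s}_n^{(j)}(\tau))|_{n \in \mathcal{N}}, \hat{u}(\tau), \hat{a}(\tau)$ be decision
variables from any other policy, with $\hat{r}_n(\tau) = r_n^{(\hat{f}_n(\tau))}(\tau)$ and
$\hat{d}_n(\tau) = d_n^{(\hat{f}_n(\tau))}(\tau)$. From (17) and Definition 3, the drift-plus-penalty
under the quadratic policy is bounded by (28) and is further
bounded by (29) under any other policy as

$$L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau)$$
$$\le \frac{1}{2}\sum_{n \in \mathcal{N}} \Big\{ [K_n(\tau) - s_n^{(q)}(\tau)]^2 + [K_n(\tau) - s_n^{(j)}(\tau)]^2 + [K_n(\tau) + d_n(\tau)]^2$$
$$\quad + [Q_n(\tau) - u_n(\tau)]^2 + [Q_n(\tau) + s_n^{(q)}(\tau)]^2 + \sum_{m \in \mathcal{N}} [Q_n(\tau) + a_{mn}(\tau)]^2$$
$$\quad + \sum_{m \in \mathcal{N}} [J_n(\tau) - a_{nm}(\tau)]^2 + [J_n(\tau) + s_n^{(j)}(\tau)]^2 - 2V r_n(\tau) + D_n(\tau) \Big\} \tag{28}$$
$$\le \frac{1}{2}\sum_{n \in \mathcal{N}} \Big\{ [K_n(\tau) - \hat{s}_n^{(q)}(\tau)]^2 + [K_n(\tau) - \hat{s}_n^{(j)}(\tau)]^2 + [K_n(\tau) + \hat{d}_n(\tau)]^2$$
$$\quad + [Q_n(\tau) - \hat{u}_n(\tau)]^2 + [Q_n(\tau) + \hat{s}_n^{(q)}(\tau)]^2 + \sum_{m \in \mathcal{N}} [Q_n(\tau) + \hat{a}_{mn}(\tau)]^2$$
$$\quad + \sum_{m \in \mathcal{N}} [J_n(\tau) - \hat{a}_{nm}(\tau)]^2 + [J_n(\tau) + \hat{s}_n^{(j)}(\tau)]^2 - 2V \hat{r}_n(\tau) + D_n(\tau) \Big\}. \tag{29}$$

From the bound (29), it follows that

$$L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau)$$
$$\le \sum_{n \in \mathcal{N}} \Big\{ K_n(\tau)\big[\hat{d}_n(\tau) - \hat{s}_n^{(q)}(\tau) - \hat{s}_n^{(j)}(\tau)\big] + Q_n(\tau)\Big[\hat{s}_n^{(q)}(\tau) + \sum_{m \in \mathcal{N}} \hat{a}_{mn}(\tau) - \hat{u}_n(\tau)\Big]$$
$$\quad + J_n(\tau)\Big[\hat{s}_n^{(j)}(\tau) - \sum_{m \in \mathcal{N}} \hat{a}_{nm}(\tau)\Big] - V \hat{r}_n(\tau) \Big\} + E \tag{30}$$

where

$$E \triangleq \frac{1}{2}\sum_{n \in \mathcal{N}} \Big[ \big(s_n^{(q)(\max)} + s_n^{(j)(\max)} + d_n^{(\max)}\big)^2 + \Big(s_n^{(q)(\max)} + u_n^{(\max)} + \sum_{m \in \mathcal{N}} a_{mn}^{(\max)}\Big)^2 + \Big(s_n^{(j)(\max)} + \sum_{m \in \mathcal{N}} a_{nm}^{(\max)}\Big)^2 \Big] \tag{31}$$

The derivations (28)-(30) show that applying the quadratic
policy to the drift-plus-penalty expression leads to the bound
(30), which is valid for every other control policy. However,
the linear minimization of (30), which leads to the max-weight
policy, does not resemble quadratic minimization of the actual
drift-plus-penalty term (16). The effects of the two policies are
revealed in Sec. V, where the quadratic policy leads to smaller
queue backlogs.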

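As discussed in Sec. II, $\omega(t)$ is i.i.d. over slots and is
assumed further to have distribution $\pi(\omega)$. Define an $\omega$-only
policy as one that makes a (possibly randomized) choice of
decision variables based only on the observed $\omega(t)$. Then we
customize an important theorem from [11].

Theorem 1: When problem (11) with stationary distribution
$\pi(\omega)$ is feasible, then for any $\delta > 0$ there exists
an $\omega$-only policy that chooses all controlled variables
$(f_n^*(t), s_n^{(q)*}(t), s_n^{(j)*}(t))|_{n \in \mathcal{N}}, u^*(t), a^*(t)$, and for all $n \in \mathcal{N}$:

$$\mathbb{E}\{y_0^*(t)\} \ge y_0^{(\mathrm{opt})} - \delta \tag{32}$$
$$\mathbb{E}\{d_n^*(t) - s_n^{(q)*}(t) - s_n^{(j)*}(t)\} \le \delta \tag{33}$$
$$\mathbb{E}\Big\{s_n^{(q)*}(t) + \sum_{m \in \mathcal{N}} a_{mn}^*(t) - u_n^*(t)\Big\} \le \delta \tag{34}$$
$$\mathbb{E}\Big\{s_n^{(j)*}(t) - \sum_{m \in \mathcal{N}} a_{nm}^*(t)\Big\} \le \delta \tag{35}$$

where $y_0^{(\mathrm{opt})}$ is the optimal solution of problem (11). Also,
$y_0^*(t) = \sum_{n \in \mathcal{N}} r_n^*(t)$, where $r_n^*(t) \triangleq r_n^{(f_n^*(t))}(t)$ and
$d_n^*(t) \triangleq d_n^{(f_n^*(t))}(t)$.

We additionally assume all constraints of the network can
be achieved with $\epsilon$ slackness [11]:

Assumption 1: There are values $\epsilon > 0$ and $0 \le y_0^{(\epsilon)} \le y_0^{(\max)}$
and an $\omega$-only policy choosing all controlled variables
$(f_n^*(t), s_n^{(q)*}(t), s_n^{(j)*}(t))|_{n \in \mathcal{N}}, u^*(t), a^*(t)$ that satisfies for
all $n \in \mathcal{N}$:

$$\mathbb{E}\{y_0^*(t)\} = y_0^{(\epsilon)} \tag{36}$$
$$\mathbb{E}\{d_n^*(t) - s_n^{(q)*}(t) - s_n^{(j)*}(t)\} \le -\epsilon \tag{37}$$
$$\mathbb{E}\Big\{s_n^{(q)*}(t) + \sum_{m \in \mathcal{N}} a_{mn}^*(t) - u_n^*(t)\Big\} \le -\epsilon \tag{38}$$
$$\mathbb{E}\Big\{s_n^{(j)*}(t) - \sum_{m \in \mathcal{N}} a_{nm}^*(t)\Big\} \le -\epsilon. \tag{39}$$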



A. Performance Analysis 

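Since our quadratic algorithm satisfies the bound (30),
where the right-hand-side is in terms of any alternative policy
$(\hat{f}_n(\tau), \hat{s}_n^{(q)}(\tau), \hat{s}_n^{(j)}(\tau))|_{n \in \mathcal{N}}, \hat{u}(\tau), \hat{a}(\tau)$, it holds for any
$\omega$-only policy $(f_n^*(\tau), s_n^{(q)*}(\tau), s_n^{(j)*}(\tau))|_{n \in \mathcal{N}}, u^*(\tau), a^*(\tau)$.
Substituting an $\omega$-only policy into (30) and taking conditional
expectations gives:

$$\mathbb{E}\{L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \mid \Theta(\tau)\} \tag{40}$$
$$\le \sum_{n \in \mathcal{N}} \Big\{ K_n(\tau)\,\mathbb{E}\{d_n^*(\tau) - s_n^{(q)*}(\tau) - s_n^{(j)*}(\tau) \mid \Theta(\tau)\}$$
$$\quad + Q_n(\tau)\,\mathbb{E}\Big\{s_n^{(q)*}(\tau) + \sum_{m \in \mathcal{N}} a_{mn}^*(\tau) - u_n^*(\tau) \,\Big|\, \Theta(\tau)\Big\}$$
$$\quad + J_n(\tau)\,\mathbb{E}\Big\{s_n^{(j)*}(\tau) - \sum_{m \in \mathcal{N}} a_{nm}^*(\tau) \,\Big|\, \Theta(\tau)\Big\} - V\,\mathbb{E}\{r_n^*(\tau) \mid \Theta(\tau)\} \Big\} + E$$
$$\le \sum_{n \in \mathcal{N}} \Big\{ K_n(\tau)\,\mathbb{E}\{d_n^*(\tau) - s_n^{(q)*}(\tau) - s_n^{(j)*}(\tau)\}$$
$$\quad + Q_n(\tau)\,\mathbb{E}\Big\{s_n^{(q)*}(\tau) + \sum_{m \in \mathcal{N}} a_{mn}^*(\tau) - u_n^*(\tau)\Big\}$$
$$\quad + J_n(\tau)\,\mathbb{E}\Big\{s_n^{(j)*}(\tau) - \sum_{m \in \mathcal{N}} a_{nm}^*(\tau)\Big\} - V\,\mathbb{E}\{r_n^*(\tau)\} \Big\} + E \tag{41}$$

where we have used the fact that conditional expectations
given $\Theta(\tau)$ on the right-hand-side above are the same as
unconditional expectations because $\omega$-only policies do not
depend on $\Theta(\tau)$.

Theorem 2: If Assumption 1 holds, then the time-averaged
total quality of information $\bar{y}_0$ is within $O(1/V)$ of optimality
under the quadratic policy, while the total queue backlog grows
with $O(V)$.

Theorem 2 is proven by substituting the $\omega$-only policies
from Theorem 1 and Assumption 1 into the right-hand-side of
(41), as shown in the next subsections.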

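1) Quality of Information vs. V: Using the $\omega$-only policy
from (32)-(35) in the right-hand-side of (41) gives:

$$\mathbb{E}\{L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \mid \Theta(\tau)\} \le E - V\big(y_0^{(\mathrm{opt})} - \delta\big) + \delta \sum_{n \in \mathcal{N}} [K_n(\tau) + Q_n(\tau) + J_n(\tau)]$$

This inequality is valid for every $\delta > 0$. Therefore

$$\mathbb{E}\{L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \mid \Theta(\tau)\} \le E - V y_0^{(\mathrm{opt})}.$$

Taking an expectation and summing from $\tau = 0$ to $t - 1$:

$$\mathbb{E}\Big\{L(\Theta(t)) - L(\Theta(0)) - V \sum_{\tau=0}^{t-1} y_0(\tau)\Big\} \le Et - V t\, y_0^{(\mathrm{opt})}.$$

With rearrangement and $L(\Theta(t)) \ge 0$, it follows that

$$\sum_{\tau=0}^{t-1} \mathbb{E}\{y_0(\tau)\} \ge -\frac{Et}{V} + t\, y_0^{(\mathrm{opt})} - \frac{\mathbb{E}\{L(\Theta(0))\}}{V}.$$

Dividing by $t$ and taking the limit as $t$ approaches infinity, the
performance of the quadratic policy is lower bounded by

$$\liminf_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{y_0(\tau)\} \ge -\frac{E}{V} + y_0^{(\mathrm{opt})}. \tag{42}$$

This shows that the system can be pushed to the optimality
$y_0^{(\mathrm{opt})}$ by increasing $V$ under the quadratic policy.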



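2) Total Queue Backlog vs. V: Now consider the existence
of an $\omega$-only policy with Assumption 1. Using (36)-(39) in
the right-hand-side of (41) gives:

$$\mathbb{E}\{L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \mid \Theta(\tau)\} \le E - V y_0^{(\epsilon)} - \epsilon \sum_{n \in \mathcal{N}} [K_n(\tau) + Q_n(\tau) + J_n(\tau)].$$

Taking an expectation and summing from $\tau = 0$ to $t - 1$:

$$\mathbb{E}\Big\{L(\Theta(t)) - L(\Theta(0)) - V \sum_{\tau=0}^{t-1} y_0(\tau)\Big\} \le Et - V t\, y_0^{(\epsilon)} - \epsilon \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}} \mathbb{E}\{K_n(\tau) + Q_n(\tau) + J_n(\tau)\}$$

With rearrangement and $L(\Theta(t)) \ge 0$, it follows that

$$\sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}} \mathbb{E}\{K_n(\tau) + Q_n(\tau) + J_n(\tau)\} \le \frac{Et}{\epsilon} + \frac{V}{\epsilon}\Big(\sum_{\tau=0}^{t-1} \mathbb{E}\{y_0(\tau)\} - t\, y_0^{(\epsilon)}\Big) + \frac{\mathbb{E}\{L(\Theta(0))\}}{\epsilon}$$
$$\le \frac{Et}{\epsilon} + \frac{Vt}{\epsilon}\big(y_0^{(\max)} - y_0^{(\epsilon)}\big) + \frac{\mathbb{E}\{L(\Theta(0))\}}{\epsilon}.$$

Dividing by $t$ and taking the limit as $t$ approaches infinity, the
time-averaged total queue backlog is bounded by

$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}} \mathbb{E}\{K_n(\tau) + Q_n(\tau) + J_n(\tau)\} \le \frac{E}{\epsilon} + \frac{V}{\epsilon}\big(y_0^{(\max)} - y_0^{(\epsilon)}\big). \tag{43}$$

This shows that the overall queue length tends to increase
linearly as $V$ is increased. This is an asymptotic bound which
shows that every queue is strongly stable, and the network is
strongly stable.

The $V$ parameter in (42) and (43) affects the performance
trade-off $[O(1/V), O(V)]$ between quality of information and
total queue backlog. These results are similar to those that
can be derived under the max-weight algorithm. However,
simulation in the next section shows a significant reduction of
queue backlog under the quadratic policy.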
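To make the $[O(1/V), O(V)]$ trade-off concrete, the two bounds can be evaluated for a few values of $V$. The sketch below assumes the bounds read as an optimality gap of at most $E/V$ and an average backlog of at most $E/\epsilon + V(y_0^{(\max)} - y_0^{(\epsilon)})/\epsilon$; the function names and all numerical constants are hypothetical and chosen only for illustration.

```python
def utility_gap_bound(E, V):
    """Upper bound on the gap to optimal quality: E / V."""
    return E / V


def backlog_bound(E, V, eps, y_max, y_eps):
    """Upper bound on time-averaged total backlog."""
    return E / eps + V * (y_max - y_eps) / eps


# Hypothetical constants; only the scaling in V matters here.
E, eps, y_max, y_eps = 10.0, 0.5, 4.0, 1.0
for V in (1.0, 5.0, 25.0, 125.0):
    print(V, utility_gap_bound(E, V), backlog_bound(E, V, eps, y_max, y_eps))
```

Each fivefold increase in $V$ shrinks the quality gap bound by a factor of five while the backlog bound grows roughly linearly in $V$.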



B. Deterministic bounds of queue lengths 

Here we show that, in addition to the average queue size 
bounds derived in the previous subsection, our algorithm 
also yields deterministic worst-case queue size bounds which 



is summarized in the following lemma. Define K„ 



(max) 



2Vri 



2,11 



dn X> for n G Af, and Q„ 



(max) 



max 



K 



(max) 



,{^r x) } 



m£Af 



E, 



(max) . (max) 



Lemma 2: For all devices $n \in \mathcal{N}$ and all slots $t \ge 0$, we
have:

$$K_n(t) \le K_n^{(\max)} \tag{44}$$
$$J_n(t) \le K_n^{(\max)} \tag{45}$$
$$Q_n(t) \le Q_n^{(\max)} \tag{46}$$

provided that these inequalities hold at $t = 0$.

Proof: The bounds (44)-(46) are proved in Sections
IV-B1 to IV-B3, respectively. ■
1) Input Queue: From the admission-control problem (18),
if $(r_n(t), d_n(t)) = (0, 0)$, then the objective value of
the problem is $K_n(t)^2$. Therefore, device $n$ only chooses
$(r_n(t), d_n(t)) \ne (0, 0)$ when

$$[K_n(t) + d_n(t)]^2 - 2V r_n(t) \le K_n(t)^2$$
$$2K_n(t) d_n(t) + d_n(t)^2 - 2V r_n(t) \le 0$$
$$K_n(t) \le \frac{2V r_n(t) - d_n(t)^2}{2 d_n(t)} \tag{47}$$

The right-hand side of (47) is in turn at most its maximum over the
available formats $f$.

Fig. 2. Small network with independent channels with distributions shown.



m e A/\ so Q n (t + 1) < Q(t) < Q^ r 
ii) When Q n (t) < max if., 



(max) 



(max) 



|id mx) } 



then this 



^ ^' queue may received data si'' (t) and a mn (t) for some m G A/", 



This implies that device n can only obtain data when (l47l i 
holds, and receives no new data otherwise. 

Fix t, and assume K n {t) < K^r x) for this slot t. From ©, 
there are two cases to consider. 



so 



(max) 



I) If < K n (t) < Kii 
K n (t+l) = K n (t)+d n (t) <K, 



.(max) 
(max) 



then (|47j holds and 



ii) If K n {t) > K, 



(max) 



7 (max) 
n •> 

-(max) 



Qn{t H~ 1) ^ max Q n (t) + S^(t),0\ + amn® 

< Q(t) + 4 9)(raax) + E a ™« x) 

< ol max) . 



then d47l > does not hold and 



if 9 (t + 1) = *T„(t) < ^ max) . Thus, given that K n (0) < Thus, given Q n (0) < Q { ^\ Q n (t) < QT" 1 for all t > 

by mathematical induction. 

V. Simulation 

Simulation under the proposed quadratic policy and the 
standard max-weight policy is performed over a small network 
in Fig. |2] The network contains two devicess, Af = {1,2}. 
Each device has the other as its neighbor, so Hi = {2} and 
H2 = {!}■ An event occurs in every slot with probability 
9 = 0.3. We assume all uplink and relay channels are 
orthogonal. The uplink channel distribution for device 1 is 
better than that of device 2 as in Fig. [2] 

The constraints are u n (t) G {0, . . . , Un l \r](t))} for 
n G Af. Also, a 12 (t) G {0, . . . , af 2 est) (rj(t))} and a 2 i(t) G 



■j (max) 



id™ x) , K n (t) < K ( n' dx) for all t > by mathematical 
induction. 

2) Relay Queue: Fix t and assume for each device n G Af 
that J n (t) < K^ 3 ^ for this slot t. From the closed form 



solution d22l i and (0, there are three cases to consider, 
i) When K n (t) - J n (t) < 0, then s ( r P(t) = 0, and 

J n (t + 1) < max \j n (t) + s<j\i), 



Jn(t) <K^\ 



ii) When K n (t) - J n (t) > 2s 



2s<f )(lnax) ), then s%>(t) = sX 1 ^', and 



_ (i)( max ) 



(or J n (t) < K n (t) - 



J n (t + 1) < max 
< max 



Jn(t) 



,(.;) 



(t),0 



(q) (max) 



Jj)(max) 



K n (t)-s^™*\0 



= 30. 



<K n (t) <K^\ 



iii) When < K n (t) - J n (t) < 2s 
2 



0> 



then s$(t) < 



and 

J n (t + 1) < max 
< max 



K n (t) + j n {ty 



.0 



< K n (t) < K* 



(max) 



Thus, given that J„(0) < K ( ™ x) , J n (t) < K ( n mRx) for all i > 
by mathematical induction. 

3) Uplink Queue: To provide a general upper bound for the 
uplink queue, we assume that all relay channels are orthogonal. 
This implies every device n G Af can transmit and receive 
relayed data simultanously. 

Fix t and assume Q n (t) < Qn* for this slot t. Then 



consider Q n (t + 1) from (©. 
i) When Q n (t) > max K, 



(max) 



,|id raax) | 



, from 



and ([27j, it follows that s£'(t) = and a mn (t) = for all 



{0,...,4rW))}- Then set fl « 
The feasible set of formats is T — {0,1,2,3} with con- 
stant options given by (di\ 0) ,ri 0) ) = (0, 0), (d£\ r£^) = 
(100, 20), {d!n\ ri 2) ) = (50, 15), (dif\r { ^) = (10, 10) when- 
ever there is an event. 

The simulation is performed according to the algorithm in 
Sec. IIII-CI The time-averaged quality of information under 
the quadratic and max-weight policies are shown in Fig. [3] 
From the plot, the values of yo under both policies converge 
to optimality following the 0(1/ V) performance bound. 

Fig. H^bc reveals queue lengths in the input, uplink, and 
relay queues of device 1 under the quadratic and max-weight 
policies. At the same V, the quadratic policy reduces queue 
lengths by a significant constant compared to the cases under 
the max-weight policy. The plot also shows the growth of 
queue lengths with parameter V, which follows the 0(V) 
bound of the queue length. Fig. HJi shows the average total 
queue length in device 1 under the quadratic and max-weight 
policies. 

Fig. [5] shows that the quadratic policy can achieve near 
optimality with significantly smaller total system backlog 
compared to the case under the max-weight policy. This shows 
a significant advantage, which in turn affects memory size and 
packet delay. 
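The admission rule from the input-queue analysis can be illustrated with the format options quoted above. The following is a minimal sketch, not the authors' implementation: it assumes the per-device objective is $[K_n(t) + d_n(t)]^2 - 2V r_n(t)$ as stated in Section IV-B1, and the names `FORMATS`, `choose_format`, and `admits_data` are hypothetical.

```python
from typing import List, Tuple

# (data length d, quality r) options taken from the simulation setup above.
FORMATS: List[Tuple[float, float]] = [(0, 0), (100, 20), (50, 15), (10, 10)]


def choose_format(K: float, V: float,
                  formats: List[Tuple[float, float]] = FORMATS) -> Tuple[float, float]:
    """Return the (d, r) pair minimizing (K + d)^2 - 2*V*r for backlog K."""
    return min(formats, key=lambda dr: (K + dr[0]) ** 2 - 2.0 * V * dr[1])


def admits_data(K: float, V: float, d: float, r: float) -> bool:
    """Condition (47): a nonzero format is worthwhile only if K <= (2Vr - d^2) / (2d)."""
    return d > 0 and K <= (2.0 * V * r - d * d) / (2.0 * d)


if __name__ == "__main__":
    for backlog in (0, 300, 10000):
        print(backlog, choose_format(backlog, V=1000))
```

With an empty input queue the highest-quality format is selected; as the backlog grows the rule falls back to shorter formats and eventually admits nothing, which is the mechanism behind the deterministic bound on $K_n(t)$.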



Fig. 3. Quality of Information versus V under the quadratic (QD) and max-weight (MW) policies

Fig. 4. Averaged backlog in queues versus V and system quality versus backlog under the quadratic (QD) and max-weight (MW) policies

Fig. 5. The system obtains average quality of information while having average total queue length

Fig. 6. Larger network with independent channels with distributions shown

Fig. 7. Convergence of time-averaged quality of information. The interval of the moving average is 500 slots.



A larger network, shown in Fig. 6, is simulated to observe convergence of the proposed algorithm. As in the small network scenario, the probability of event occurrence is set to $\theta = 0.3$. Channel distributions are configured as in Fig. 6. For $V = 800$, the time-averaged quality of information is 25.00 after $10^6$ time slots, as shown in the upper plot of Fig. 7. The lower plot in Fig. 7 shows the early period of the simulation to illustrate the convergence time.

VI. Linear Programs by Quadratic Policy 

The generality of the quadratic policy is illustrated in this section. The policy is applied to solve linear programs, which is one application of Lyapunov optimization [12].

A. Problem Transformation 

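The following static linear program is considered, where $(x_i)|_{i=1}^{n}$ are decision variables and $(a_{ji})|_{j=1,i=1}^{m,n}$, $(b_j)|_{j=1}^{m}$, $(c_i)|_{i=1}^{n}$, $(x_i^{(\max)})|_{i=1}^{n}$ are constants.

\[
\begin{aligned}
\text{Maximize} \quad & \sum_{i=1}^{n} c_i x_i \\
\text{Subject to} \quad & \sum_{i=1}^{n} a_{ji} x_i \le b_j, && j \in \{1,\ldots,m\} \\
& 0 \le x_i \le x_i^{(\max)}, && i \in \{1,\ldots,n\}
\end{aligned}
\tag{48}
\]

In order to solve (48), the following time-averaged optimization problem is solved by using the Lyapunov optimization technique.

\[
\begin{aligned}
\text{Maximize} \quad & \bar{y}_0 = \sum_{i=1}^{n} c_i \bar{x}_i \\
\text{Subject to} \quad & \sum_{i=1}^{n} a_{ji} \bar{x}_i \le b_j, && j \in \{1,\ldots,m\} \\
& 0 \le x_i(t) \le x_i^{(\max)}, && i \in \{1,\ldots,n\},\ t \ge 0
\end{aligned}
\tag{49}
\]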



Solutions from the static problem (48) and the time-averaged problem (49) are equivalent. Using a solution to the static problem for every $t$ in the time-averaged problem leads to $\bar{x}_i = x_i$ for $i \in \{1,\ldots,n\}$, so every constraint in the time-averaged problem is satisfied. The time-averaged objective function is also maximized, since the time average of a linear function is equal to the function of the time averages. Conversely, the time averages from a solution to the time-averaged problem form a solution of the static problem because they satisfy all constraints and maximize the same objective function.

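To solve problem (49), the concept of a virtual queue is used [12]. Let $x_i(t)$ be chosen every slot $t$ in the interval $0 \le x_i(t) \le x_i^{(\max)}$, and define $\bar{x}_i(t)$ for $i \in \{1,\ldots,n\}$ as

\[ \bar{x}_i(t) = \frac{1}{t}\sum_{\tau=0}^{t-1} x_i(\tau). \]

Define the virtual queue

\[ Z_j(t+1) = \max\left[Z_j(t) + \sum_{i=1}^{n} a_{ji} x_i(t) - b_j,\ 0\right] \tag{50} \]

for all $j \in \{1,\ldots,m\}$. It follows that

\[ Z_j(t+1) = \max\left[Z_j(t) + \sum_{i=1}^{n} a_{ji} x_i(t) - b_j,\ 0\right] \ge Z_j(t) + \sum_{i=1}^{n} a_{ji} x_i(t) - b_j, \]
\[ Z_j(\tau+1) - Z_j(\tau) \ge \sum_{i=1}^{n} a_{ji} x_i(\tau) - b_j. \]

Summing from $\tau = 0$ to $t-1$, and dividing by $t$:

\[ Z_j(t) - Z_j(0) \ge \sum_{\tau=0}^{t-1}\sum_{i=1}^{n} a_{ji} x_i(\tau) - t b_j, \]
\[ \frac{Z_j(t)}{t} \ge \frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{i=1}^{n} a_{ji} x_i(\tau) - b_j = \sum_{i=1}^{n} a_{ji}\bar{x}_i(t) - b_j, \tag{51} \]

where we assume that $Z_j(0) \ge 0$. It follows that if $Z_j(t)/t \to 0$ (so that each queue is "rate stable"), the desired time-average inequality constraint is satisfied.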
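The relation between the virtual queue (50) and the constraint violation in (51) can be checked numerically. The sketch below is illustrative only: it handles a single constraint, starts from $Z_j(0) = 0$, and the helper names `update_virtual_queue` and `queue_vs_violation` are hypothetical.

```python
from typing import List, Sequence, Tuple


def update_virtual_queue(Z: float, a_row: Sequence[float],
                         x: Sequence[float], b: float) -> float:
    """One step of (50): Z(t+1) = max[Z(t) + sum_i a_i * x_i(t) - b, 0]."""
    return max(Z + sum(ai * xi for ai, xi in zip(a_row, x)) - b, 0.0)


def queue_vs_violation(a_row: Sequence[float], b: float,
                       xs: List[Sequence[float]]) -> Tuple[float, float]:
    """Run (50) from Z(0) = 0 over the decisions xs.

    Returns (Z(t)/t, sum_i a_i * xbar_i(t) - b), the two sides of (51).
    """
    Z = 0.0
    totals = [0.0] * len(a_row)
    for x in xs:
        Z = update_virtual_queue(Z, a_row, x, b)
        totals = [s + xi for s, xi in zip(totals, x)]
    t = len(xs)
    xbar = [s / t for s in totals]
    return Z / t, sum(ai * xb for ai, xb in zip(a_row, xbar)) - b


if __name__ == "__main__":
    # Decisions that alternate between violating and satisfying x1 + x2 <= 4.
    decisions = [[10.0, 10.0], [0.0, 0.0]] * 5
    print(queue_vs_violation([1.0, 1.0], 4.0, decisions))
```

Whenever the averaged decisions violate the constraint, the queue grows at least as fast as the violation, so keeping $Z_j(t)/t$ small forces the time-averaged constraint to hold.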

Then let $\Theta(t) = (Z_j(t))|_{j=1}^{m}$ be a vector of all virtual queues and

\[ y_0(t) = \sum_{i=1}^{n} c_i x_i(t) \]

be the objective function whose time average is to be maximized according to the problem (49). Define a time-averaged objective value up to iteration $t$ by

\[ \bar{y}_0(t) = \frac{1}{t}\sum_{\tau=0}^{t-1} y_0(\tau). \]

Similar to Sec. III-C, let $\bar{y}_0 = \lim_{t\to\infty} \bar{y}_0(t)$ and $\bar{x}_i = \lim_{t\to\infty} \bar{x}_i(t)$ be their asymptotic averages.

B. Lyapunov Optimization 

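To solve (49), the drift-plus-penalty for this problem is bounded by Lemma 1 as

\[
\begin{aligned}
& L(\Theta(t+1)) - L(\Theta(t)) - V y_0(t) \\
&\quad = \frac{1}{2}\sum_{j=1}^{m}\left\{Z_j^2(t+1) - Z_j^2(t)\right\} - V y_0(t) \\
&\quad = \frac{1}{2}\sum_{j=1}^{m}\left\{\max\left[Z_j(t) + \sum_{i=1}^{n} a_{ji}x_i(t) - b_j,\ 0\right]^2 - Z_j^2(t)\right\} - V\sum_{i=1}^{n} c_i x_i(t) && (52) \\
&\quad \le \frac{1}{2}\sum_{j=1}^{m}\left\{\sum_{i=1}^{n}\left[Z_j(t) + a_{ji}x_i(t)\right]^2 + \left[Z_j(t) - b_j\right]^2 - (n+1)Z_j^2(t) + D_j\right\} - V\sum_{i=1}^{n} c_i x_i(t) \\
&\quad = \frac{1}{2}\sum_{i=1}^{n}\left\{\sum_{j=1}^{m}\left[Z_j(t) + a_{ji}x_i(t)\right]^2 - 2Vc_i x_i(t)\right\} + \frac{1}{2}\sum_{j=1}^{m}\left\{\left[Z_j(t) - b_j\right]^2 - (n+1)Z_j^2(t) + D_j\right\}, && (53)
\end{aligned}
\]

where

\[ D_j = 2\left\{\sum_{i=1}^{n}\sum_{k=i+1}^{n} |a_{ji}||a_{jk}|\, x_i^{(\max)} x_k^{(\max)} + \sum_{i=1}^{n} |a_{ji}||b_j|\, x_i^{(\max)}\right\} \]

for $j \in \{1,\ldots,m\}$. From (53), the quadratic policy minimizes the bound on the drift-plus-penalty every iteration, and this minimization is

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n}\left\{\sum_{j=1}^{m}\left[Z_j(t) + a_{ji}x_i(t)\right]^2 - 2Vc_i x_i(t)\right\} \\
\text{Subject to} \quad & 0 \le x_i(t) \le x_i^{(\max)}, \quad i \in \{1,\ldots,n\}.
\end{aligned}
\tag{54}
\]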



Again, because of the problem's structure and the fully separable property of the quadratic policy, problem (54) can be solved separately for each $x_i(t)$. A closed form solution of each $x_i(t)$ for $i \in \{1,\ldots,n\}$ is

\[ x_i(t) = \max\left[\min\left[\frac{c_i V - \sum_{j=1}^{m} a_{ji} Z_j(t)}{\sum_{j=1}^{m} a_{ji}^2},\ x_i^{(\max)}\right],\ 0\right]. \]
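As a check on this closed form, the per-variable subproblem of (54) can be worked out directly. The derivation below is a sketch under the assumption that $\sum_{j=1}^{m} a_{ji}^2 > 0$ for every $i$; the symbol $g_i$ is introduced here only for illustration.

```latex
\begin{align*}
g_i(x) &= \sum_{j=1}^{m}\left[Z_j(t) + a_{ji}x\right]^2 - 2Vc_i x, \\
g_i'(x) &= 2\sum_{j=1}^{m} a_{ji}\left[Z_j(t) + a_{ji}x\right] - 2Vc_i = 0
\;\Longrightarrow\;
x = \frac{c_i V - \sum_{j=1}^{m} a_{ji}Z_j(t)}{\sum_{j=1}^{m} a_{ji}^2}.
\end{align*}
```

Since $g_i$ is a convex quadratic in $x$, its minimizer over $[0, x_i^{(\max)}]$ is the unconstrained stationary point clipped to that interval, which gives the max-min expression.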



C. Algorithm 

An algorithm to solve problem (49), which also solves (48), is the following.

Algorithm 3: Linear programming by quadratic policy

    Initialize Θ(0) = 0
    t = 0
    foreach iteration t ≥ 0 do
        // Update decision variables
        foreach i ∈ {1, ..., n} do
            x_i(t) = max[ min[ (c_i V − Σ_{j=1}^{m} a_ji Z_j(t)) / Σ_{j=1}^{m} a_ji², x_i^(max) ], 0 ]
        end
        // Update virtual queues
        foreach j ∈ {1, ..., m} do
            Z_j(t+1) = max[ Z_j(t) + (Σ_{i=1}^{n} a_ji x_i(t) − b_j), 0 ]
        end
        t = t + 1
    end

D. Convergence Analysis

Since our policy chooses $x_i(t) \in [0, x_i^{(\max)}]$ every slot to minimize the right-hand side of (53), this right-hand side is less than or equal to the corresponding value with any other feasible decision $x_i^*(\tau) \in [0, x_i^{(\max)}]$:

\[
\begin{aligned}
& L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \\
&\quad \le \frac{1}{2}\sum_{i=1}^{n}\left\{\sum_{j=1}^{m}\left[Z_j(\tau) + a_{ji}x_i^*(\tau)\right]^2 - 2Vc_i x_i^*(\tau)\right\} + \frac{1}{2}\sum_{j=1}^{m}\left\{\left[Z_j(\tau) - b_j\right]^2 - (n+1)Z_j^2(\tau) + D_j\right\} \\
&\quad \le \sum_{j=1}^{m} Z_j(\tau)\left[\sum_{i=1}^{n} a_{ji}x_i^*(\tau) - b_j\right] - V\sum_{i=1}^{n} c_i x_i^*(\tau) + E, && (55)
\end{aligned}
\]

where

\[ E = \frac{1}{2}\sum_{j=1}^{m}\left[\sum_{i=1}^{n} |a_{ji}|\, x_i^{(\max)} + |b_j|\right]^2 \]

and the final inequality uses the definition of $D_j$.

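Assume that problem (48) has $x^* = (x_i^*)|_{i=1}^{n}$ as an optimal solution and $y_0^{(\text{opt})}$ as the optimal cost. This optimal solution has the following properties:

\[ \sum_{i=1}^{n} c_i x_i^* = y_0^{(\text{opt})}, \]
\[ \sum_{i=1}^{n} a_{ji} x_i^* \le b_j, \quad j \in \{1,\ldots,m\}. \]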



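By applying $x_i^*(\tau) = x_i^*$ every iteration, the bound (55) becomes

\[ L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \le -V y_0^{(\text{opt})} + E. \]

Summing from $\tau = 0$ to $t-1$ and rearranging lead to

\[ L(\Theta(t)) - L(\Theta(0)) - V\sum_{\tau=0}^{t-1} y_0(\tau) \le Et - Vt\, y_0^{(\text{opt})} \tag{56} \]

and

\[ \sum_{\tau=0}^{t-1} y_0(\tau) \ge \frac{L(\Theta(t)) - L(\Theta(0)) - Et}{V} + t\, y_0^{(\text{opt})}. \]

Since Algorithm 3 initializes $\Theta(0) = 0$, we have $L(\Theta(0)) = 0$, and also $L(\Theta(t)) \ge 0$. Then, dividing by $t$ leads to

\[ \frac{1}{t}\sum_{\tau=0}^{t-1} y_0(\tau) \ge \frac{L(\Theta(t))}{tV} - \frac{E}{V} + y_0^{(\text{opt})} \ge y_0^{(\text{opt})} - \frac{E}{V}. \tag{57} \]

The bound (57) shows that, when $V$ is large, the time-averaged objective value from the algorithm approaches the optimal objective value.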

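Since the feasible set of $(x_i(t))|_{i=1}^{n}$ in problem (49) is bounded, there exists some $y_0^{(\max)} > 0$ such that $y_0(t) \le y_0^{(\max)}$ for all $t$. Then, the bound (56) can also be rearranged to be

\[ L(\Theta(t)) \le Et + V\sum_{\tau=0}^{t-1} y_0(\tau) - Vt\, y_0^{(\text{opt})}, \]
\[ \sum_{j=1}^{m} Z_j^2(t) \le 2Et + 2Vt\left[y_0^{(\max)} - y_0^{(\text{opt})}\right], \]
\[ Z_j(t) \le \sqrt{2Et + 2Vt\left[y_0^{(\max)} - y_0^{(\text{opt})}\right]}, \]
\[ \frac{Z_j(t)}{t} \le \sqrt{\frac{1}{t}\left\{2E + 2V\left[y_0^{(\max)} - y_0^{(\text{opt})}\right]\right\}}. \tag{58} \]

From (51), it follows that

\[ \sum_{i=1}^{n} a_{ji}\bar{x}_i(t) - b_j \le \sqrt{\frac{1}{t}\left\{2E + 2V\left[y_0^{(\max)} - y_0^{(\text{opt})}\right]\right\}}. \]

The bound (58) shows that the constraints of problem (48) are asymptotically satisfied as $t$ approaches infinity.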

When the number of iterations is limited, we can obtain convergence results by setting $V = 1/\epsilon$ and $t = 1/\epsilon^3$ in (57) and (58). This leads to

\[ y_0^{(\text{opt})} - \frac{1}{t}\sum_{\tau=0}^{t-1} y_0(\tau) \le E\epsilon = O(\epsilon) \]

and

\[ \sum_{i=1}^{n} a_{ji}\bar{x}_i(t) - b_j \le \sqrt{\frac{1}{1/\epsilon^3}\left\{2E + \frac{2}{\epsilon}\left[y_0^{(\max)} - y_0^{(\text{opt})}\right]\right\}} = O(\epsilon). \tag{59} \]

Therefore, using $O(1/\epsilon^3)$ iterations ensures the time-averaged value of $y_0(t)$ is within $O(\epsilon)$ of the optimal value $y_0^{(\text{opt})}$ and all constraints are within $O(\epsilon)$ of being satisfied. However, this $O(1/\epsilon^3)$ tradeoff can be improved to an $O(1/\epsilon^2)$ tradeoff if the problem (49) satisfies a mild "Slater assumption" as follows.

Assumption 2: There are values $\varepsilon > 0$ and $y_0^{(\varepsilon)} \le y_0^{(\text{opt})} \le y_0^{(\max)}$ and a static policy choosing $(x_i^*)|_{i=1}^{n}$ every iteration that satisfies:

\[ y_0^*(t) = y_0^{(\varepsilon)}, \]
\[ \sum_{i=1}^{n} a_{ji} x_i^* - b_j \le -\varepsilon, \quad j \in \{1,\ldots,m\}, \]
\[ 0 \le x_i^* \le x_i^{(\max)}, \quad i \in \{1,\ldots,n\}. \]

In fact, this assumption is a static version of Assumption 1 and is similar to a general Slater condition in convex optimization theory [11].

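Applying Assumption 2 to (55), it follows that

\[
\begin{aligned}
L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau)
&\le -V\sum_{i=1}^{n} c_i x_i^* + \sum_{j=1}^{m} Z_j(\tau)\left[\sum_{i=1}^{n} a_{ji}x_i^* - b_j\right] + E \\
&\le -V y_0^{(\varepsilon)} - \varepsilon\sum_{j=1}^{m} Z_j(\tau) + E.
\end{aligned}
\]

From the triangle inequality, $\|Z(\tau)\| \le \sum_{j=1}^{m} Z_j(\tau)$, the above inequality becomes

\[ L(\Theta(\tau+1)) - L(\Theta(\tau)) - V y_0(\tau) \le -V y_0^{(\varepsilon)} - \varepsilon\|Z(\tau)\| + E. \]

Since $L(\Theta(\tau)) = \frac{1}{2}\|Z(\tau)\|^2$, we have:

\[ \|Z(\tau+1)\|^2 - \|Z(\tau)\|^2 \le 2\left[V\left(y_0(\tau) - y_0^{(\varepsilon)}\right) - \varepsilon\|Z(\tau)\| + E\right]. \]

If $\|Z(\tau)\| \ge \frac{V\left(y_0^{(\max)} - y_0^{(\varepsilon)}\right) + E}{\varepsilon}$, then

\[ \|Z(\tau+1)\|^2 - \|Z(\tau)\|^2 \le 0. \]

Since $\sum_{j=1}^{m} Z_j^2(\tau) = \|Z(\tau)\|^2$, the above inequality implies that the value of $\sum_{j=1}^{m} Z_j^2(\tau)$ is not increased in the next iteration. Therefore, the value of each $Z_j(\tau)$ is bounded by, for all $\tau \ge 0$,

\[ Z_j(\tau) \le \frac{V\left(y_0^{(\max)} - y_0^{(\varepsilon)}\right) + E}{\varepsilon} + \sum_{i=1}^{n} |a_{ji}|\, x_i^{(\max)}. \]

Dividing by $\tau$:

\[ \frac{Z_j(\tau)}{\tau} \le \frac{V\left(y_0^{(\max)} - y_0^{(\varepsilon)}\right) + E}{\varepsilon\tau} + \frac{\sum_{i=1}^{n} |a_{ji}|\, x_i^{(\max)}}{\tau}. \]

When $V = 1/\epsilon$ and $\tau = 1/\epsilon^2$ (here $\epsilon$ is the accuracy parameter, distinct from the Slater constant $\varepsilon$), it follows that

\[ \frac{Z_j(\tau)}{\tau} \le \frac{\frac{1}{\epsilon}\left(y_0^{(\max)} - y_0^{(\varepsilon)}\right) + E}{\varepsilon/\epsilon^2} + \frac{\sum_{i=1}^{n} |a_{ji}|\, x_i^{(\max)}}{1/\epsilon^2} = O(\epsilon). \]

From (51), it follows that

\[ \sum_{i=1}^{n} a_{ji}\bar{x}_i(\tau) - b_j \le O(\epsilon). \]

Thus, under Assumption 2, using $O(1/\epsilon^2)$ iterations ensures the time-averaged value $\bar{y}_0(t)$ is within $O(\epsilon)$ of the optimal value $y_0^{(\text{opt})}$, and all constraints are within $O(\epsilon)$ of being satisfied. This is the $O(1/\epsilon^2)$ tradeoff between computation and accuracy.

TABLE I
Numerical results from an example problem

                       Quadratic   Max-weight   Optimal
    \bar{x}_1(500)     2.531       2.540        2.500
    \bar{x}_2(500)     0.834       0.820        0.833
    x_1(500)           2.500       0.000        2.500
    x_2(500)           0.833       0.000        0.833

Fig. 8. Comparison between max-weight and quadratic policies for solving linear programming. (Upper panel: values of decision variables under the max-weight policy versus iteration. Lower panel: values of decision variables under the quadratic policy versus iteration.)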



E. Example 

As an example, we solved a small linear programming problem by using the max-weight and quadratic policies. The problem is

\[
\begin{aligned}
\text{Maximize} \quad & 2x_1 + x_2 \\
\text{Subject to} \quad & x_1 + x_2 \le 4 \\
& 5x_1 + 3x_2 \le 15 \\
& x_1 \le 2.5 \\
& 0 \le x_1 \le 10 \\
& 0 \le x_2 \le 10.
\end{aligned}
\]

The solution of this problem is $x_1 = 2.5$, $x_2 = 0.833$. For both policies, the parameters are $V = 200$ and the number of iterations is 500. The values of decision variables $x_i(t)$ from both policies are shown in Fig. 8. The numerical values are shown in Table I.

The time-averaged values of decision variables from both policies approach the optimal solution. If the number of iterations is increased, the precision increases. Interestingly, the quadratic policy is smooth, as shown in Fig. 8, and its intermediate decision values converge to an optimal solution before the time-averaged values do.
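The example can be reproduced with a direct transcription of Algorithm 3. The code below is a minimal sketch rather than the authors' simulation code: the function name `solve_lp_quadratic` and its argument layout are assumptions, and it requires every column of the constraint matrix to contain at least one nonzero entry so that the closed form is well defined.

```python
from typing import List, Sequence, Tuple


def solve_lp_quadratic(a: Sequence[Sequence[float]], b: Sequence[float],
                       c: Sequence[float], x_max: Sequence[float],
                       V: float, iterations: int) -> Tuple[List[float], List[float]]:
    """Quadratic-policy iteration for: max c.x  s.t.  a x <= b, 0 <= x <= x_max.

    Returns (last decision vector, time average of all decision vectors).
    """
    m, n = len(b), len(c)
    # Denominators sum_j a_ji^2 of the closed-form update (assumed positive).
    col_sq = [sum(a[j][i] ** 2 for j in range(m)) for i in range(n)]
    Z = [0.0] * m          # virtual queues, initialized to zero
    x = [0.0] * n
    totals = [0.0] * n
    for _ in range(iterations):
        # Update decision variables with the clipped closed-form solution.
        x = [
            max(min((c[i] * V - sum(a[j][i] * Z[j] for j in range(m))) / col_sq[i],
                    x_max[i]), 0.0)
            for i in range(n)
        ]
        # Update virtual queues, one per inequality constraint.
        Z = [
            max(Z[j] + sum(a[j][i] * x[i] for i in range(n)) - b[j], 0.0)
            for j in range(m)
        ]
        totals = [s + xi for s, xi in zip(totals, x)]
    return x, [s / iterations for s in totals]


if __name__ == "__main__":
    A = [[1.0, 1.0], [5.0, 3.0], [1.0, 0.0]]
    x_last, x_avg = solve_lp_quadratic(A, [4.0, 15.0, 2.5], [2.0, 1.0],
                                       [10.0, 10.0], V=200.0, iterations=500)
    print("last iterate:", x_last)
    print("time average:", x_avg)
```

The function returns both the final decision vector and the running average, so the two kinds of convergence discussed above (intermediate values versus time averages) can be compared directly.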

VII. Conclusion 

We studied information quality maximization in a system with uplink and single-hop relay capability by designing the queueing dynamics. Using Lyapunov optimization theory, we proposed a novel quadratic policy with a separable property, which leads to a distributed mechanism for format selection. In comparison with the standard max-weight policy, our policy leads to an algorithm that reduces queue backlog by a significant constant. This reduction also propagates and grows with the number of queues in the system. We simulated the algorithm to verify the correctness and behavior of the new policy. In addition, we showed how the novel policy can be applied to solve linear programs.
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